A Unified Morpho-Syntactic Scheme of Stanford Dependencies
نویسنده
چکیده
Stanford Dependencies (SD) provide a functional characterization of the grammatical relations in syntactic parse-trees. The SD representation is useful for parser evaluation, for downstream applications, and, ultimately, for natural language understanding, however, the design of SD focuses on structurally-marked relations and under-represents morphosyntactic realization patterns observed in Morphologically Rich Languages (MRLs). We present a novel extension of SD, called Unified-SD (U-SD), which unifies the annotation of structurallyand morphologically-marked relations via an inheritance hierarchy. We create a new resource composed of U-SDannotated constituency and dependency treebanks for the MRL Modern Hebrew, and present two systems that can automatically predict U-SD annotations, for gold segmented input as well as raw texts, with high baseline accuracy.
منابع مشابه
An annotation scheme for Persian based on Autonomous Phrases Theory and Universal Dependencies
A treebank is a corpus with linguistic annotations above the level of the parts of speech. During the first half of the present decade, three treebanks have been developed for Persian either originally or subsequently based on dependency grammar: Persian Treebank (PerTreeBank), Persian Syntactic Dependency Treebank, and Uppsala Persian Dependency Treebank (UPDT). The syntactic analysis of a sen...
متن کاملUniversal Dependencies for Persian
The Persian Universal Dependency Treebank (Persian UD) is a recent effort of treebanking Persian with Universal Dependencies (UD), an ongoing project that designs unified and cross-linguistically valid grammatical representations including part-of-speech tags, morphological features, and dependency relations. The Persian UD is the converted version of the Uppsala Persian Dependency Treebank (UP...
متن کاملA Study on Morpho-Syntactic Patterns: A Cohesive Device in Some Persian Live Sport Radio and TV Talks
Morpho-syntactic patterns device encompasses a subcategory of the cohesive devices that assists hearers to have an adequate mental representation for understanding speech. This article investigates the morpho-syntactic patterns employed in some Persian live sport radio and TV programs adapting Dooley and Levinsohn’s theoretical and analytical framework. The research data includes around 30,000 ...
متن کاملMore Constructions, More Genres: Extending Stanford Dependencies
The Stanford dependency scheme aims to provide a simple and intuitive but linguistically sound way of annotating the dependencies between words in a sentence. In this paper, we address two limitations the scheme has suffered from: First, despite providing good coverage of core grammatical relations, the scheme has not offered explicit analyses of more difficult syntactic constructions; second, ...
متن کاملA Persian Treebank with Stanford Typed Dependencies
We present the Uppsala Persian Dependency Treebank (UPDT) with a syntactic annotation scheme based on Stanford Typed Dependencies. The treebank consists of 6,000 sentences and 151,671 tokens with an average sentence length of 25 words. The data is from different genres, including newspaper articles and fiction, as well as technical descriptions and texts about culture and art, taken from the op...
متن کامل